Session D-3

RFID Applications

Conference
10:00 AM — 11:30 AM EDT
Local
May 4 Wed, 7:00 AM — 8:30 AM PDT

Encoding based Range Detection in Commodity RFID Systems

Xi Yu and Jia Liu (Nanjing University, China); Shigeng Zhang (Central South University, China); Xingyu Chen, Xu Zhang and Lijun Chen (Nanjing University, China)

RFID technologies have been widely used for item-level object monitoring and tracking in industrial applications. In this paper, we study the problem of range detection in a commodity RFID system, which aims to quickly figure out whether there are any target tags holding specific data between a lower and an upper boundary. This helps users pinpoint tagged objects of interest (if any) and give an early warning to reduce potential risk, e.g., in temperature monitoring for fire safety. We propose a time-efficient protocol called encoding range query (EnRQ). The basic idea is to use a sparse vector to separate target tags from the others with a few select commands. The sparse vector is specifically designed by encoding the tag's data based on notational systems. We implement EnRQ in commodity RFID systems with no need for any hardware modifications. Extensive experiments show that EnRQ improves time efficiency by more than 40% on average compared with the state of the art.
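The abstract does not spell out the encoding, but the idea of turning a data range into a handful of select commands can be illustrated with a standard base-b prefix decomposition. The sketch below is only an assumed illustration of how such an encoding might look (the base, digit width, and helper names are made up); EnRQ's actual sparse-vector construction may differ.

```python
def to_digits(value, base, width):
    """Fixed-width digit expansion of a tag's data value in the given base."""
    digits = []
    for _ in range(width):
        digits.append(value % base)
        value //= base
    return digits[::-1]

def range_prefixes(lo, hi, base, width):
    """Decompose [lo, hi] into aligned base-`base` digit prefixes; each prefix
    would correspond to one Select mask matching exactly the tags whose
    encoded data falls inside that sub-range."""
    prefixes = []
    while lo <= hi:
        span, free = 1, 0
        while lo % (span * base) == 0 and lo + span * base - 1 <= hi:
            span *= base
            free += 1
        prefixes.append(tuple(to_digits(lo, base, width)[:width - free]))
        lo += span
    return prefixes

# Example: 8-bit sensor readings encoded with base-4 digits (width 4);
# query all tags whose reading lies in [37, 150].
masks = range_prefixes(37, 150, base=4, width=4)
tag_data = {"t1": 12, "t2": 40, "t3": 90, "t4": 200}
targets = [tag for tag, v in tag_data.items()
           if any(tuple(to_digits(v, 4, 4))[:len(m)] == m for m in masks)]
# targets == ["t2", "t3"]
```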

RC6D: An RFID and CV Fusion System for Real-time 6D Object Pose Estimation

Bojun Zhang (Tianjin University, China); Mengning Li (Shanghai Jiao Tong University, China); Xin Xie (Hong Kong Polytechnic University, Hong Kong); Luoyi Fu (Shanghai Jiao Tong University, China); Xinyu Tong and Xiulong Liu (Tianjin University, China)

This paper studies the problem of 6D pose estimation, which is practically important in various application scenarios such as robotic object grasping, autonomous driving, and object integration in mixed reality. However, existing methods suffer from at least one of five major limitations: dependence on object identification, complex deployment, difficulty in data collection, low accuracy, and incomplete estimation. This paper proposes RC6D, the first system to estimate 6D poses by fusing RFID and computer vision (CV) data with multi-modal deep learning techniques. In RC6D, we first detect 2D keypoints through a deep learning approach. We then propose a novel RFID-CV fusion neural network to predict the depth of the scene, and use the estimated depth information to expand the 2D keypoints to 3D keypoints. Finally, we model the coordinate correspondences between the 2D and 3D keypoints, which are applied to estimate the 6D pose of the target object. The experimental results show that the localization error of RC6D is less than 10 cm with a probability higher than 90.64% and its orientation estimation error is less than 10 degrees with a probability higher than 79.63%. Hence, the proposed RC6D system performs much better than state-of-the-art solutions.
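The fusion network itself is paper-specific, but the final geometric step the abstract describes (lifting 2D keypoints to 3D with estimated depth and then recovering the 6D pose from keypoint correspondences) can be sketched as follows. The array shapes and the use of a Kabsch alignment are assumptions for illustration, not RC6D's actual solver.

```python
import numpy as np

def backproject(kpts_2d, depth, K):
    """Lift 2D keypoints (u, v) into 3D camera coordinates using the predicted
    per-keypoint depth and the camera intrinsic matrix K."""
    uv1 = np.c_[kpts_2d, np.ones(len(kpts_2d))]            # (N, 3) homogeneous pixels
    return (np.linalg.inv(K) @ uv1.T).T * depth[:, None]    # (N, 3) points in metres

def rigid_pose(model_pts, cam_pts):
    """Least-squares rotation/translation (Kabsch) mapping the object's model
    keypoints onto the lifted camera-frame keypoints; (R, t) is the 6D pose."""
    mu_m, mu_c = model_pts.mean(0), cam_pts.mean(0)
    H = (model_pts - mu_m).T @ (cam_pts - mu_c)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                                # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_c - R @ mu_m
    return R, t
```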

RCID: Fingerprinting Passive RFID Tags via Wideband Backscatter

Jiawei Li, Ang Li, Dianqi Han and Yan Zhang (Arizona State University, USA); Tao Li (Indiana University-Purdue University Indianapolis, USA); Yanchao Zhang (Arizona State University, USA)

Tag cloning and spoofing pose great challenges to RFID applications. This paper presents the design and evaluation of RCID, a novel system for fingerprinting RFID tags based on the unique reflection coefficient of each tag circuit. Built on a novel OFDM-based fingerprint collector, our system can quickly acquire and verify each tag's RCID fingerprints, which are independent of the RFID reader and measurement environment. Our system applies to COTS RFID tags and readers after a firmware update at the reader. Extensive prototyped experiments on 600 tags confirm that RCID is highly secure, with authentication accuracy up to 97.15% and a median authentication error rate of 1.49%. RCID is also highly usable because it takes only about 8 s to enroll a tag and 2 ms to verify an RCID fingerprint with a fully connected multi-class neural network. Finally, empirical studies demonstrate that the entropy of an RCID fingerprint is about 202 bits over a bandwidth of 20 MHz, in contrast to the best prior result of 17 bits, thus offering strong theoretical resilience to RFID cloning and spoofing.
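The abstract mentions verifying fingerprints with a fully connected multi-class neural network; a minimal sketch of such a verifier is shown below. The layer widths, the 64-subcarrier fingerprint length, and the acceptance rule are assumptions rather than the paper's configuration (only the 600-tag scale comes from the abstract).

```python
import torch
import torch.nn as nn

class FingerprintClassifier(nn.Module):
    """Fully connected multi-class network: one output class per enrolled tag."""
    def __init__(self, n_subcarriers=64, n_tags=600):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * n_subcarriers, 256), nn.ReLU(),   # real + imaginary parts per subcarrier
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, n_tags),
        )

    def forward(self, fingerprint):
        return self.net(fingerprint)                        # logits over enrolled tags

# Verification sketch: accept the claimed identity only if it is the top prediction.
model = FingerprintClassifier()
logits = model(torch.randn(1, 2 * 64))                      # placeholder fingerprint
claimed_tag = 42
accepted = logits.argmax(dim=1).item() == claimed_tag
```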

Revisiting RFID Missing Tag Identification

Kanghuai Liu (SYSU, China); Lin Chen (Sun Yat-sen University, China); Junyi Huang and Shiyuan Liu (SYSU, China); Jihong Yu (Beijing Institute of Technology / Simon Fraser University, China)

We revisit the problem of missing tag identification in RFID networks and make three contributions. Firstly, we quantitatively compare and gauge the existing propositions spanning over a decade on missing tag identification. We show that the expected execution time of the best solution in the literature is $\Omega\left(N+\frac{(1-\alpha)^2(1-\delta)^2}{\epsilon^2}\right)$, where $\alpha$, $\delta$, and $\epsilon$ are parameters quantifying the required detection accuracy and $N$ denotes the number of tags in the system. Secondly, we analytically establish the expected execution time lower bound for any missing tag identification algorithm as $\Omega\left(\frac{N}{\log N}+\frac{(1-\delta)^2(1-\alpha)^2}{\epsilon^2 \log \frac{(1-\delta)(1-\alpha)}{\epsilon}}\right)$, thus giving the theoretical performance limit. Thirdly, we develop a novel missing tag identification algorithm by leveraging a tree-based structure with an expected execution time of $\Omega\left(\frac{\log\log N}{\log N}N+\frac{(1-\alpha)^2(1-\delta)^2}{\epsilon^2}\right)$, reducing the time overhead by a factor of up to $\log N$ over the best algorithm in the literature. The key technicality in our design is a novel data structure termed the collision-partition tree (CPT), built upon a subset of bits in tag pseudo-IDs, leading to a more balanced tree structure and hence reducing the time complexity of parsing the entire tree.
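The collision-partition tree is only named in the abstract, so the sketch below is merely a guess at the general idea: recursively pick, from a subset of pseudo-ID bits, the bit that splits the current tag set most evenly, so the resulting tree stays balanced. The function names, balance criterion, and leaf size are all assumptions, not the paper's construction.

```python
def build_cpt(tags, bits, leaf_size=1):
    """Recursively partition a set of integer pseudo-IDs by the candidate bit
    that splits the current subset most evenly.

    Returns a nested (bit, zero_subtree, one_subtree) structure, or a leaf list
    once the subset is small enough or the candidate bits are exhausted.
    """
    if len(tags) <= leaf_size or not bits:
        return tags
    # Pick the bit whose 0/1 split is closest to balanced on the current subset.
    best = min(bits, key=lambda b: abs(sum((t >> b) & 1 for t in tags) - len(tags) / 2))
    zeros = [t for t in tags if not (t >> best) & 1]
    ones = [t for t in tags if (t >> best) & 1]
    rest = [b for b in bits if b != best]
    return (best, build_cpt(zeros, rest, leaf_size), build_cpt(ones, rest, leaf_size))

# Toy example with 4-bit pseudo-IDs and all four bit positions as candidates.
tree = build_cpt([0b1011, 0b0110, 0b1110, 0b0001], bits=list(range(4)))
```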

Session Chair

Song Min Kim (KAIST)

Session D-5

Mobile Applications 1

Conference
2:30 PM — 4:00 PM EDT
Local
May 4 Wed, 11:30 AM — 1:00 PM PDT

DeepEar: Sound Localization with Binaural Microphones

Qiang Yang and Yuanqing Zheng (The Hong Kong Polytechnic University, Hong Kong)

Binaural microphones, i.e., two microphones with artificial human-shaped ears, are pervasively used in humanoid robots for decorative purposes as well as for improving sound quality. In many applications, it is crucial for such robots to interact with humans by finding the voice direction. However, sound source localization with binaural microphones remains challenging, especially in multi-source scenarios. Prior works utilize microphone arrays to deal with the multi-source localization problem, but extra arrays incur higher deployment cost and take up more space. Human brains, in contrast, have evolved to locate multiple sound sources with only two ears. Inspired by this fact, we propose DeepEar, a binaural-microphone-based localization system that can locate multiple sound sources. To this end, we develop a neural network to mimic the acoustic signal processing pipeline of the human auditory system. Different from the hand-crafted features used in prior works, DeepEar can automatically extract useful features for localization. More importantly, the trained neural network can be extended and adapted to new environments with a minimal amount of extra training data. Experimental results show that DeepEar substantially outperforms a state-of-the-art deep learning approach, with a sound detection accuracy of 93.3% and an azimuth estimation error of 7.4 degrees in multi-source scenarios.
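DeepEar learns its features end-to-end; the toy network below only illustrates the input/output structure such a binaural localizer might have: a two-channel (left/right ear) spectrogram in, and per-azimuth-sector detection probabilities out, so several sectors can be active at once. The architecture, sector count, and input shape are placeholders, not the paper's design.

```python
import torch
import torch.nn as nn

class BinauralLocalizer(nn.Module):
    """Toy two-channel network that outputs, for each azimuth sector, the
    probability that a sound source is active in that direction."""
    def __init__(self, n_sectors=36):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.head = nn.Linear(32 * 4 * 4, n_sectors)

    def forward(self, spec):                  # spec: (batch, 2, freq, time)
        z = self.encoder(spec).flatten(1)
        return torch.sigmoid(self.head(z))    # multi-label: several sectors may be active

model = BinauralLocalizer()
probs = model(torch.randn(1, 2, 64, 100))     # placeholder binaural spectrogram
```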

Impact of Later-Stages COVID-19 Response Measures on Spatiotemporal Mobile Service Usage

André Felipe Zanella, Orlando E. Martínez-Durive and Sachit Mishra (IMDEA Networks Institute, Spain); Zbigniew Smoreda (Orange Labs & France Telecom Group, France); Marco Fiore (IMDEA Networks Institute, Spain)

The COVID-19 pandemic has affected our lives and how we use network infrastructures in an unprecedented way. While early studies have started shedding light on the link between COVID-19 containment measures and mobile network traffic, we presently lack a clear understanding of the implications of the virus outbreak, and of our reaction to it, on the usage of mobile apps. We contribute to closing this gap by investigating how the spatiotemporal usage of mobile services evolved through the different response measures enacted in France over a continuous seven-month period in 2020 and 2021. Our work complements previous studies in several ways: (i) it delves into individual service dynamics, whereas previous studies have not gone beyond broad service categories; (ii) it encompasses different types of containment strategies, allowing us to observe their diverse effects on mobile traffic; and (iii) it covers both spatial and temporal behaviors, providing a comprehensive view of the phenomenon. These elements of novelty let us offer new insights into how the demand for hundreds of different mobile services reacts to the new environment set forth by the pandemic.

SAH: Fine-grained RFID Localization with Antenna Calibration

Xu Zhang, Jia Liu, Xingyu Chen, Wenjie Li and Lijun Chen (Nanjing University, China)

Radio frequency identification (RFID)-based localization has attracted increasing attention due to the competitive advantages of RFID tags: unique identification, low cost, and battery-free operation. Although many advanced phase-based localization methods have been proposed, few of them fully take the unknown phase center (PC) and phase offset (PO) into account, even though these are key factors in fine-grained localization. In this paper, we propose a novel localization algorithm called Segment Aligned Hologram (SAH) that jointly calibrates the PC and the PO. More specifically, SAH first builds a phase matrix and then designs a phase alignment algorithm based on the phase matrix to reduce the multipath effect. With a clean phase profile, SAH constructs a hologram for calibration and localization, which greatly reduces the system errors. We implement SAH on commercial RFID devices. Extensive experiments show that SAH can achieve mm-level accuracy in both the lateral and radial directions with only a single antenna.
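SAH's contribution is the PC/PO calibration and phase alignment, which the abstract does not detail; the sketch below only shows the generic hologram back-end such a method might feed: score candidate tag positions by how coherently the measured phases match the round-trip phase predicted from each position. The carrier frequency, the grid, and the absence of any calibration terms are simplifying assumptions.

```python
import numpy as np

C = 3e8
FREQ = 920.625e6                      # one UHF RFID channel (assumed)
WAVELEN = C / FREQ

def hologram(antenna_positions, phases, grid):
    """Score each candidate tag position by the coherence between measured
    phases and the round-trip phases predicted from that position.

    antenna_positions: (M, 3) antenna locations, phases: (M,) measured phases,
    grid: (G, 3) candidate tag positions.
    """
    scores = []
    for p in grid:
        d = np.linalg.norm(antenna_positions - p, axis=1)     # antenna-to-candidate distance
        predicted = (4 * np.pi * d / WAVELEN) % (2 * np.pi)   # round-trip phase model
        scores.append(np.abs(np.mean(np.exp(1j * (phases - predicted)))))
    best = grid[int(np.argmax(scores))]
    return best, np.array(scores)
```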

Separating Voices from Multiple Sound Sources using 2D Microphone Array

Xinran Lu, Lei Xie and Fang Wang (Nanjing University, China); Tao Gu (Macquarie University, Australia); Chuyu Wang, Wei Wang and Sanglu Lu (Nanjing University, China)

Voice assistants have been widely used for human-computer interaction and automatic meeting minutes. However, with multiple sound sources, the performance of speech recognition in voice assistants decreases dramatically. Therefore, it is crucial to separate multiple voices efficiently for an effective voice assistant application in multi-user scenarios. In this paper, we present a novel voice separation system using a 2D microphone array in multiple sound source scenarios. Specifically, we propose a spatial filtering-based method to iteratively estimate the Angle of Arrival (AoA) of each sound source and separate the voice signals with adaptive beamforming. We use BeamForming-based cross-Correlation (BF-Correlation) to accurately assess the performance of beamforming and automatically optimize the voice separation in the iterative framework. Different from general cross-correlation, BF-Correlation further performs cross-correlation among the after-beamforming voice signals processed by each linear microphone array. In this way, the mutual interference from voice signals outside the specified direction can be effectively suppressed or mitigated via the spatial filtering technique. We implement a prototype system and evaluate its performance in real environments. Experimental results show that the average AoA error is 1.4 degrees and the average automatic speech recognition accuracy ratio is 90.2% in the presence of three sound sources.
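The abstract describes beamforming each linear sub-array toward a candidate direction and then cross-correlating the beamformed outputs (BF-Correlation) to score that direction. A simplified, non-adaptive version of those two building blocks is sketched below; the actual system uses adaptive beamforming and an iterative AoA search, so treat this delay-and-sum variant as illustration only.

```python
import numpy as np

def delay_and_sum(signals, mic_positions, theta, fs, c=343.0):
    """Frequency-domain delay-and-sum beamformer for one linear sub-array.

    signals: (n_mics, n_samples), mic_positions: (n_mics,) in metres along the
    array axis, theta: steering angle in radians relative to broadside.
    """
    n_mics, n = signals.shape
    delays = mic_positions * np.sin(theta) / c               # per-mic delay toward theta
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spec = np.fft.rfft(signals, axis=1)
    steer = np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
    return np.fft.irfft((spec * steer).mean(axis=0), n)

def bf_correlation(beam_a, beam_b):
    """Normalized peak cross-correlation between the beamformed outputs of two
    linear sub-arrays steered at the same candidate direction."""
    a = (beam_a - beam_a.mean()) / (np.std(beam_a) + 1e-9)
    b = (beam_b - beam_b.mean()) / (np.std(beam_b) + 1e-9)
    corr = np.correlate(a, b, mode="full") / len(a)
    return corr.max()
```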

Session Chair

Zhichao Cao (Michigan State University)

Session D-6

Mobile Applications 2

Conference
4:30 PM — 6:00 PM EDT
Local
May 4 Wed, 1:30 PM — 3:00 PM PDT

An RFID and Computer Vision Fusion System for Book Inventory using Mobile Robot

Jiuwu Zhang and Xiulong Liu (Tianjin University, China); Tao Gu (Macquarie University, Australia); Bojun Zhang (Tianjin University, China); Dongdong Liu, Zijuan Liu and Keqiu Li (Tianjin University, China)

Mobile robot-assisted book inventory, such as book identification and book order detection, has become increasingly popular, replacing manual book inventory, which is time-consuming and error-prone. Existing systems are either computer vision (CV)-based or RFID-based; however, both suffer from inevitable limitations. CV-based systems cannot identify books effectively due to low detection accuracy. RFID tags attached to book spines can be used to identify a book uniquely; however, in tag-dense scenarios, coupling effects seriously affect reading accuracy. To overcome these limitations, this paper presents a novel RFID and CV fusion system for book inventory using a mobile robot. RFID and CV are first used individually to obtain the book order, and then the information is fused with a sequence-based algorithm. Specifically, we address three technical challenges. We design a deep neural network (DNN) with multiple inputs and mixed data to filter out unrelated tags, propose a video information extraction scheme to extract information accurately, and use strong links to align and match the RFID- and CV-based timestamp-versus-book-name sequences to avoid errors during fusion. Extensive experiments indicate that our system achieves an average accuracy of 98.4% for tier filtering and an average accuracy of 98.9% for book order, outperforming the state of the art.
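The abstract says matched "strong links" are used to align the RFID-derived and CV-derived book sequences; a bare-bones way to obtain such anchors is a longest-common-subsequence alignment, sketched below. This is a stand-in for the paper's fusion algorithm, which also handles timestamps and tag filtering.

```python
def align_orders(rfid_seq, cv_seq):
    """Longest-common-subsequence alignment between the book order inferred from
    RFID and the order inferred from CV; books matched in both sequences act as
    anchors, and the remaining items can be placed relative to them."""
    n, m = len(rfid_seq), len(cv_seq)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if rfid_seq[i] == cv_seq[j]
                                else max(dp[i][j + 1], dp[i + 1][j]))
    anchors, i, j = [], n, m
    while i and j:                           # backtrack to recover the matched anchors
        if rfid_seq[i - 1] == cv_seq[j - 1]:
            anchors.append(rfid_seq[i - 1]); i -= 1; j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return anchors[::-1]

# Example: the two modalities disagree on a couple of books.
anchors = align_orders(["A", "B", "C", "D"], ["A", "C", "B", "D"])  # ["A", "B", "D"] or ["A", "C", "D"]
```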

GASLA: Enhancing the Applicability of Sign Language Translation

Jiao Li, Yang Liu, Weitao Xu and Zhenjiang Li (City University of Hong Kong, Hong Kong)

This paper studies an important yet overlooked applicability issue in existing American Sign Language (ASL) translation systems. Although abundant sensing data have already been collected for each ASL word, current designs treat every to-be-recognized sentence as new and collect its sensing data from scratch, while the number of sentences and the number of data samples per sentence are usually large. It takes a long time to complete the data collection for each user, e.g., hours to half a day, which inevitably places a non-trivial burden on end users and prevents broader adoption of ASL systems in practice. In this paper, we identify the cause of this issue. We present GASLA, built atop wearable sensors, to instrument our design. With GASLA, sentence-level sensing data can be generated automatically from word-level data and then applied to train ASL systems. Moreover, GASLA provides a clear interface that can be directly integrated into existing ASL systems to reduce overhead. With this ability, sign language translation can become highly lightweight in both initial setup and future new-sentence addition. Compared with the roughly 10 per-sentence data samples required by current systems, GASLA requires only 2-3 samples to achieve similar performance.
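GASLA's generation method is not described in the abstract; the sketch below only illustrates the basic word-to-sentence idea of stitching word-level sensor segments into a synthetic sentence-level sample, with a short cross-fade at each joint to approximate the transition between consecutive signs. The overlap length and smoothing scheme are assumptions.

```python
import numpy as np

def synthesize_sentence(word_segments, overlap=20):
    """Stitch word-level sensor segments (each an (n_samples, n_channels) array)
    into one sentence-level sample, cross-fading `overlap` samples at each joint."""
    sentence = word_segments[0].astype(float)
    fade = np.linspace(0, 1, overlap)[:, None]
    for seg in word_segments[1:]:
        seg = seg.astype(float)
        blended = sentence[-overlap:] * (1 - fade) + seg[:overlap] * fade
        sentence = np.vstack([sentence[:-overlap], blended, seg[overlap:]])
    return sentence

# Example: three word-level samples from a hypothetical 6-axis wearable IMU.
words = [np.random.randn(120, 6), np.random.randn(90, 6), np.random.randn(110, 6)]
sentence_sample = synthesize_sentence(words)
```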

Tackling Multipath and Biased Training Data for IMU-Assisted BLE Proximity Detection

Tianlang He and Jiajie Tan (The Hong Kong University of Science and Technology, China); Steve Zhuo (HKUST, Hong Kong); Maximilian Printz and S.-H. Gary Chan (The Hong Kong University of Science and Technology, China)

Proximity detection determines whether an IoT receiver is within a certain distance from a signal transmitter. Due to its low cost and popularity, we consider Bluetooth Low Energy (BLE) for proximity detection based on the received signal strength indicator (RSSI). Because RSSI can be markedly influenced by device carriage states, previous works attempted to address this with inertial measurement units (IMUs) and deep learning. However, they have not sufficiently accounted for RSSI fluctuation due to multipath. Furthermore, the IMU training data may be biased, which hampers the system's robustness and generalizability. This issue has not been considered before.

We propose PRID, an IMU-assisted BLE proximity detection approach robust against RSSI fluctuation and IMU data bias. PRID histogramizes RSSI to extract multipath features and uses carriage state regularization to mitigate overfitting under IMU data bias. We further propose PRID-lite, based on a binarized neural network, to cut the memory requirement for resource-constrained devices. We have conducted extensive experiments under different multipath environments and data bias levels, and on a crowdsourced dataset. Our results show that PRID reduces false detection cases by over 50% compared with existing approaches. PRID-lite reduces the PRID model size by over 90% and extends battery life by 60%, with a minor compromise in accuracy (7%).
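The abstract states that PRID "histogramizes RSSI to extract multipath features"; a minimal version of that feature step might look like the following, where the bin count and dBm range are assumptions rather than the paper's settings.

```python
import numpy as np

def rssi_histogram(rssi_window, n_bins=20, lo=-100, hi=-30):
    """Turn a window of BLE RSSI samples (dBm) into a normalized histogram, so
    the multipath-induced spread of RSSI, not just its mean, is exposed to the
    downstream proximity classifier."""
    hist, _ = np.histogram(rssi_window, bins=n_bins, range=(lo, hi))
    return hist / max(hist.sum(), 1)

# Example: a 1-second window of RSSI readings becomes one fixed-length feature.
feature = rssi_histogram([-62, -65, -71, -63, -80, -64, -66])
```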

VR Viewport Pose Model for Quantifying and Exploiting Frame Correlations

Ying Chen and Hojung Kwon (Duke University, USA); Hazer Inaltekin (Macquarie University, Australia); Maria Gorlatova (Duke University, USA)

The importance of the dynamics of the viewport pose, i.e., the location and orientation of users' points of view, for virtual reality (VR) experiences calls for the development of VR viewport pose models. In this paper, informed by our experimental measurements of viewport trajectories in 3 VR games and across 3 different types of VR interfaces, we first develop a statistical model of viewport poses in VR environments. Based on the developed model, we examine the correlations between pixels in VR frames that correspond to different viewport poses, and obtain an analytical expression for the visibility similarity (ViS) of pixels across different VR frames. We then propose ALG-ViS, a lightweight ViS-based algorithm that adaptively splits VR frames into background and foreground, reusing the background across different frames. Our implementation of ALG-ViS in two Oculus Quest 2 rendering systems demonstrates that ALG-ViS runs in real time, supports the full VR frame rate, and outperforms baselines on measures of frame quality and bandwidth consumption.
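ALG-ViS derives ViS analytically from the pose model; the code below only shows the frame-splitting decision it drives (reuse the previously rendered background when the similarity between consecutive viewport poses is high enough), with a crude geometric stand-in for ViS. The threshold, field of view, and similarity formula are all assumptions, not the paper's expression.

```python
import numpy as np

def reuse_background(prev_pose, cur_pose, vis_similarity, threshold=0.9):
    """Reuse the cached background frame if the visibility similarity between
    the two viewport poses is above the threshold; otherwise re-render it."""
    return vis_similarity(prev_pose, cur_pose) >= threshold

def toy_vis_similarity(prev_pose, cur_pose, fov_deg=90.0, max_dist=1.0):
    """Crude stand-in for the analytical ViS: penalize both the rotation between
    the two view directions and the translation between the two viewpoints.

    Each pose is (position (3,), unit view direction (3,))."""
    p0, d0 = prev_pose
    p1, d1 = cur_pose
    ang = np.degrees(np.arccos(np.clip(np.dot(d0, d1), -1.0, 1.0)))
    rot_term = max(0.0, 1.0 - ang / fov_deg)
    trans_term = max(0.0, 1.0 - np.linalg.norm(np.asarray(p1) - np.asarray(p0)) / max_dist)
    return rot_term * trans_term
```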

Session Chair

Chuyu Wang (Nanjing University)
